580 research outputs found

    ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes

    The two popular datasets ScanRefer [16] and ReferIt3D [3] connect natural language to real-world 3D data. In this paper, we curate a large-scale and complementary dataset extending both the aforementioned ones by associating all objects mentioned in a referential sentence to their underlying instances inside a 3D scene. Specifically, our Scan Entities in 3D (ScanEnts3D) dataset provides explicit correspondences between 369k objects across 84k natural referential sentences, covering 705 real-world scenes. Crucially, we show that by incorporating intuitive losses that enable learning from this novel dataset, we can significantly improve the performance of several recently introduced neural listening architectures, including improving the SoTA in both the Nr3D and ScanRefer benchmarks by 4.3% and 5.0%, respectively. Moreover, we experiment with competitive baselines and recent methods for the task of language generation and show that, as with neural listeners, 3D neural speakers can also noticeably benefit by training with ScanEnts3D, including improving the SoTA by 13.2 CIDEr points on the Nr3D benchmark. Overall, our carefully conducted experimental studies strongly support the conclusion that, by learning on ScanEnts3D, commonly used visio-linguistic 3D architectures can become more efficient and interpretable in their generalization without needing to provide these newly collected annotations at test time. The project's webpage is https://scanents3d.github.io/.
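
    The "intuitive losses" referenced above are added on top of existing neural listeners; as a rough illustration only, a minimal sketch of one plausible auxiliary phrase-to-object grounding term is given below (PyTorch; all tensor names and shapes are hypothetical and not taken from the paper).

```python
# Hypothetical auxiliary grounding loss in the spirit of ScanEnts3D's extra
# supervision: every token that mentions an object should score the correct
# 3D instance highest. The exact losses used in the paper may differ.
import torch
import torch.nn.functional as F

def phrase_grounding_loss(token_object_logits, gt_object_idx, mention_mask):
    """
    token_object_logits: (B, T, O) scores of each sentence token over O object proposals.
    gt_object_idx:       (B, T) ground-truth 3D instance index per token.
    mention_mask:        (B, T) bool, True only for tokens that mention an object.
    """
    B, T, O = token_object_logits.shape
    loss = F.cross_entropy(
        token_object_logits.reshape(B * T, O),
        gt_object_idx.reshape(B * T),
        reduction="none",
    )
    mask = mention_mask.reshape(B * T).float()
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)

# Example usage with random inputs (2 sentences, 12 tokens, 8 object proposals).
aux = phrase_grounding_loss(
    torch.randn(2, 12, 8),
    torch.randint(0, 8, (2, 12)),
    torch.rand(2, 12) > 0.5,
)
```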

    Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

    Diffusion probabilistic models (DPMs) have become a popular approach to conditional generation, due to their promising results and support for cross-modal synthesis. A key desideratum in conditional synthesis is to achieve high correspondence between the conditioning input and generated output. Most existing methods learn such relationships implicitly, by incorporating the prior into the variational lower bound. In this work, we take a different route -- we explicitly enhance input-output connections by maximizing their mutual information. To this end, we introduce a Conditional Discrete Contrastive Diffusion (CDCD) loss and design two contrastive diffusion mechanisms to effectively incorporate it into the denoising process, combining diffusion training and contrastive learning for the first time by connecting them with the conventional variational objectives. We demonstrate the efficacy of our approach in evaluations with diverse multimodal conditional synthesis tasks: dance-to-music generation, text-to-image synthesis, as well as class-conditioned image synthesis. On each, we enhance the input-output correspondence and achieve higher or competitive general synthesis quality. Furthermore, the proposed approach improves the convergence of diffusion models, reducing the number of required diffusion steps by more than 35% on two benchmarks, significantly increasing the inference speed. Comment: ICLR 2023. Project at https://github.com/L-YeZhu/CDC
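
    Since the CDCD loss maximizes mutual information between the condition and the generated output, a common way to realize such an objective is an InfoNCE-style contrastive term; the sketch below illustrates that generic idea only (the paper's actual loss is defined over discrete diffusion states, and all names here are hypothetical).

```python
# Generic InfoNCE-style contrastive term between condition embeddings and
# denoised-sample embeddings; illustrates the mutual-information idea, not the
# exact CDCD formulation used in the paper.
import torch
import torch.nn.functional as F

def contrastive_condition_loss(cond_emb, sample_emb, temperature=0.07):
    """cond_emb, sample_emb: (B, D) embeddings of the condition and the denoised output."""
    cond = F.normalize(cond_emb, dim=-1)
    samp = F.normalize(sample_emb, dim=-1)
    logits = cond @ samp.t() / temperature           # (B, B) pairwise similarities
    targets = torch.arange(cond.size(0), device=cond.device)
    # Matching (condition, sample) pairs sit on the diagonal; all other entries
    # in the batch act as negatives.
    return F.cross_entropy(logits, targets)
```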

    Intuitive, Interactive Beard and Hair Synthesis with Generative Models

    We present an interactive approach to synthesizing realistic variations in facial hair in images, ranging from subtle edits to existing hair to the addition of complex and challenging hair in images of clean-shaven subjects. To circumvent the tedious and computationally expensive tasks of modeling, rendering and compositing the 3D geometry of the target hairstyle using the traditional graphics pipeline, we employ a neural network pipeline that synthesizes realistic and detailed images of facial hair directly in the target image in under one second. The synthesis is controlled by simple and sparse guide strokes from the user, defining the general structural and color properties of the target hairstyle. We qualitatively and quantitatively evaluate our method against several alternative approaches. We show compelling interactive editing results with a prototype user interface that allows novice users to progressively refine the generated image to match their desired hairstyle, and demonstrate that our approach also allows for flexible and high-fidelity scalp hair synthesis. Comment: To be presented at the 2020 Conference on Computer Vision and Pattern Recognition (CVPR 2020, Oral Presentation). Supplementary video can be seen at: https://www.youtube.com/watch?v=v4qOtBATrv
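
    As a rough sketch of how such a stroke-conditioned, single-forward-pass synthesis step could be wired up (the channel layout, shapes, and the `generator` module below are hypothetical placeholders, not the paper's actual pipeline):

```python
# Hypothetical conditioning scheme: mask out the edit region, concatenate the
# masked image, the sparse guide-stroke map, and the mask channel-wise, and run
# one generator forward pass before compositing the result back into the image.
import torch

def synthesize_facial_hair(generator, image, stroke_map, region_mask):
    """
    image:       (B, 3, H, W) input portrait.
    stroke_map:  (B, 3, H, W) sparse user strokes encoding structure and color.
    region_mask: (B, 1, H, W) 1 inside the region to synthesize, 0 elsewhere.
    """
    masked = image * (1.0 - region_mask)
    gen_in = torch.cat([masked, stroke_map, region_mask], dim=1)   # (B, 7, H, W)
    synthesized = generator(gen_in)                                # (B, 3, H, W)
    return masked + synthesized * region_mask
```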

    Quantized GAN for Complex Music Generation from Dance Videos

    We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos. Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input. Unlike most existing conditional music generation works, which generate specific types of mono-instrumental sounds using symbolic audio representations (e.g., MIDI) and heavily rely on pre-defined musical synthesizers, in this work we generate dance music in complex styles (e.g., pop, breakdancing, etc.) by employing a Vector Quantized (VQ) audio representation, and leverage both its generality and the high abstraction capacity of its symbolic and continuous counterparts. By performing an extensive set of experiments on multiple datasets, and following a comprehensive evaluation protocol, we assess the generative quality of our approach against several alternatives. The quantitative results, which measure the music consistency, beats correspondence, and music diversity, clearly demonstrate the effectiveness of our proposed method. Last but not least, we curate a challenging dance-music dataset of in-the-wild TikTok videos, which we use to further demonstrate the efficacy of our approach in real-world applications, and which we hope will serve as a starting point for relevant future research.
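
    A very rough sketch of the VQ-based generation idea described above (module names and shapes are hypothetical; the actual D2M-GAN architecture differs in detail):

```python
# Hypothetical outline: a generator predicts a sequence of discrete VQ codebook
# indices conditioned on video and body-motion features, and a pre-trained VQ
# audio decoder maps those codes back to a waveform.
import torch

def generate_music(generator, vq_decoder, video_feats, motion_feats):
    """
    video_feats:  (B, Tv, Dv) per-frame visual features.
    motion_feats: (B, Tm, Dm) body-motion features.
    """
    cond = torch.cat([video_feats.mean(dim=1), motion_feats.mean(dim=1)], dim=-1)
    code_logits = generator(cond)            # (B, L, K) logits over K codebook entries
    codes = code_logits.argmax(dim=-1)       # (B, L) discrete VQ token sequence
    return vq_decoder(codes)                 # (B, num_samples) decoded audio waveform
```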

    Navigating Scholarly Development Through a CoP: #generational-lenses

    CoPs provide an opportunity for professional identity exploration. Extant research neglects how multiple identities, including generational memberships, influence development. This study explores the professional identity development of scholars within a multigenerational CoP.

    The α1D-adrenergic receptor is expressed intracellularly and coupled to increases in intracellular calcium and reactive oxygen species in human aortic smooth muscle cells

    Background: The cellular localization of the α1D-adrenergic receptor (α1D-AR) is controversial. Studies in heterologous cell systems have shown that this receptor is expressed in intracellular compartments. Other studies show that dimerization with other ARs promotes the cell surface expression of the α1D-AR. To assess the cellular localization in vascular smooth muscle cells, we developed an adenoviral vector for the efficient expression of a GFP-labeled α1D-AR. We also measured cellular localization with immunocytochemistry. Intracellular calcium levels, measurement of reactive oxygen species, and contraction of the rat aorta were used as measures of functional activity. Results: The adenovirally expressed α1D-AR was expressed in intracellular compartments in human aortic smooth muscle cells. The intracellular localization of the α1D-AR was also demonstrated with immunocytochemistry using an α1D-AR-specific antibody. RT-PCR analysis detected mRNA transcripts corresponding to the α1A-, α1B-, and α1D-ARs in these aortic smooth muscle cells. Therefore, the presence of the other α1-ARs, and the potential for dimerization with these receptors, does not alter the intracellular expression of the α1D-AR. Despite the predominant intracellular localization in vascular smooth muscle cells, the α1D-AR remained signaling competent and mediated the phenylephrine-induced increases in intracellular calcium. The α1D-AR was also coupled to the generation of reactive oxygen species in smooth muscle cells. There is evidence from heterologous systems that the α1D-AR heterodimerizes with the β2-AR and that desensitization of the β2-AR results in α1D-AR desensitization. In the rat aorta, desensitization of the β2-AR had no effect on contractile responses mediated by the α1D-AR. Conclusion: Our results suggest that the dimerization of the α1D-AR with other ARs does not alter the cellular expression or functional response characteristics of the α1D-AR.

    R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis

    The recent explosion of research on Neural Radiance Fields (NeRF) shows encouraging potential for representing complex scenes with neural networks. One major drawback of NeRF is its prohibitive inference time: rendering a single pixel requires querying the NeRF network hundreds of times. To address this, existing efforts mainly attempt to reduce the number of required sampled points, but the problem of iterative sampling still remains. On the other hand, the Neural Light Field (NeLF) presents a more straightforward representation than NeRF for novel view synthesis -- rendering a pixel amounts to one single forward pass without ray-marching. In this work, we present a deep residual MLP network (88 layers) to effectively learn the light field. We show that the key to successfully learning such a deep NeLF network is to have sufficient data, for which we transfer the knowledge from a pre-trained NeRF model via data distillation. Extensive experiments on both synthetic and real-world scenes show the merits of our method over other counterpart algorithms. On the synthetic scenes, we achieve 26-35x FLOPs reduction (per camera ray) and 28-31x runtime speedup, while delivering significantly better (1.4-2.8 dB average PSNR improvement) rendering quality than NeRF, without any customized implementation tricks. Comment: Project: https://snap-research.github.io/R2
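
    A minimal sketch of the distillation setup described above, assuming a simple residual MLP student and (ray, color) pseudo-data rendered by a pre-trained teacher NeRF (the layer width, ray encoding, and training details below are placeholders, not the paper's exact configuration):

```python
# Sketch of an R2L-style student: a deep residual MLP maps a ray representation
# directly to RGB (one forward pass per ray, no ray-marching), trained to match
# colors rendered by a pre-trained teacher NeRF.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.net(x)

class NeLF(nn.Module):
    def __init__(self, ray_dim, width=256, n_blocks=43):  # roughly 88 linear layers in total
        super().__init__()
        self.inp = nn.Linear(ray_dim, width)
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(n_blocks)])
        self.out = nn.Linear(width, 3)

    def forward(self, rays):                 # rays: (N, ray_dim) encoded camera rays
        return torch.sigmoid(self.out(self.blocks(self.inp(rays))))

def distill_step(student, optimizer, rays, teacher_rgb):
    """One distillation step on (ray, color) pairs produced by the teacher NeRF."""
    optimizer.zero_grad()
    loss = F.mse_loss(student(rays), teacher_rgb)
    loss.backward()
    optimizer.step()
    return loss.item()
```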

    Unexpected Learning: Development of the CoP and Its Members #generational-shift

    Our research explores how multigenerational CoPs may provide graduate students, particularly doctoral students, the space to explore and develop their professional identities and find their scholarly voices.